1 Introduction

The fidelity criteria introduced in noisy and noiseless coding theorems may 
seem excessively stringent. The classical criterion, for example, requires that 
the probability of an error in the entire block approach zero as the block 
length goes to infinity. A code with a constant nonzero error rate per symbol 
would fail this test miserably (error probability would go to one in the large 
block limit), but could still be perfectly acceptable as long as the error rate 
was sufficiently small. (Most, if not all, noisy channel coding protocols used 
with real-world communications channels are examples.) Similarly, in the
quantum mechanical case, we might be willing (taking an i.i.d. source for
simplicity in the example) to tolerate a constant rate of bad EPR pairs in
the entanglement-transmission case, or a finite deviation ("distortion") of
the average pure-state fidelity of each transmission from one. A theory which
tells us, given an "error rate" or level of distortion which we have decided we 
can tolerate, whether a given channel (noisy or noiseless) can achieve that 
error rate, would be decidedly useful. This is rate-distortion theory. 

One might think one could get by with substantially fewer resources if
one accepts the less ambitious fidelity criterion of requiring a constant
distortion rate. Classical rate-distortion theory tells us that there is no
great savings in allowing small average distortion rather than asymptotically
perfect transmission. Thus rate-distortion theory helps establish the
relevance of theoretical results, like the asymptotic block-coding versions of
noiseless and noisy channel coding, to real-world schemes.

2 A quantum version of rate-distortion 

Let us use as our measure of distortion either one minus the entanglement 
fidelity (for the entanglement transmission problem) or one minus the average 
pure-state fidelity (for pure-state ensemble transmission problems). This 
must be evaluated for single transmissions, and then averaged over the block 
of $n$ transmissions. I will confine myself to i.i.d. sources, with marginal
density operator $\rho$, at least initially. Thus

$$\rho^{(n)} = \rho^{\otimes n}.$$

The channel will be taken to be noiseless; then an $(n, 2^{nR})$
rate-distortion code consists of a map $\mathcal{E}^{(n)}$ from $n$ copies of
the source space to $n$ copies of a channel space, of total dimension
$2^{nR}$, followed by a decoding $\mathcal{D}^{(n)}$ from $n$ channels to $n$
source spaces. The average distortion for an i.i.d. source can then be defined
as:

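$$D_e(\mathcal{E}^{(n)}, \mathcal{D}^{(n)}) = \sum_{i=1}^{n} \frac{1}{n} \left(1 - F_e(\rho, \mathcal{T}_i^{(n)})\right), \tag{1}$$

where $\mathcal{T}_i^{(n)}$ is the "marginal operation" on the $i$-th copy of
the source space induced by the overall operation
$\mathcal{D}^{(n)} \circ \mathcal{E}^{(n)}$. More formally,

$$\mathcal{T}_i^{(n)}(\sigma) = \mathrm{tr}_{Q_1, \ldots, Q_{i-1}, Q_{i+1}, \ldots, Q_n} \left[ (\mathcal{D}^{(n)} \circ \mathcal{E}^{(n)}) (\rho \otimes \cdots \otimes \rho \otimes \sigma \otimes \rho \otimes \cdots \otimes \rho) \right], \tag{2}$$

where the $\sigma$ in the input density operator is in the $i$-th position.
(It is easily checked that this defines a trace-preserving operation.) The
same definition, but with $\bar{F}(E, \mathcal{T}_i^{(n)})$ (with $E$ the
source ensemble) as the fidelity criterion, defines the average pure-state
distortion $\bar{D}$.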
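
As a numerical illustration of definition (1), the following minimal sketch computes the entanglement fidelity and the block-averaged distortion. It assumes the standard Kraus-operator expression $F_e(\rho, \mathcal{E}) = \sum_k |\mathrm{tr}(\rho A_k)|^2$, which is not stated in the text above. The function names, the dephasing channel, and the stand-in marginal operations are hypothetical choices made only for the example.

```python
import numpy as np

def entanglement_fidelity(rho, kraus_ops):
    """F_e(rho, E) = sum_k |tr(rho A_k)|^2 for E(s) = sum_k A_k s A_k^dagger."""
    return float(sum(abs(np.trace(rho @ A)) ** 2 for A in kraus_ops))

def average_distortion(rho, marginal_ops):
    """Average of 1 - F_e over the n marginal operations, as in (1)."""
    return float(np.mean([1.0 - entanglement_fidelity(rho, ops)
                          for ops in marginal_ops]))

def dephasing(p):
    """Hypothetical example channel: apply Pauli Z with probability p."""
    identity = np.eye(2, dtype=complex)
    pauli_z = np.diag([1.0, -1.0]).astype(complex)
    return [np.sqrt(1 - p) * identity, np.sqrt(p) * pauli_z]

rho = np.eye(2, dtype=complex) / 2            # maximally mixed qubit source
marginals = [dephasing(0.1), dephasing(0.3)]  # stand-ins for T_1, T_2 (n = 2)
d_avg = average_distortion(rho, marginals)
print(d_avg)
```

For the maximally mixed qubit, dephasing with probability $p$ has entanglement fidelity $1 - p$, so the two per-copy distortions here are 0.1 and 0.3 and their block average is 0.2.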

R is said to be the rate of a rate-distortion code. To avoid confusion, I
note here that the rate of a rate-distortion code has a significance roughly
inverse to that of the rate of information transmission through a noisy
channel. (The terminology is already well-established in classical information
theory.) The rate in rate-distortion is the rate at which the source is
described, that is, the number of qubits, or the log of the number of Hilbert
space dimensions, used to encode the source, per source emission. Thus the
goal of rate-distortion theory is to achieve low rates, i.e. to encode the
source into as few qubits as possible per source emission.

A rate-distortion pair $(R, D)$ is achievable for a given source iff there is
a sequence of $(n, 2^{nR})$ rate-distortion codes
$(\mathcal{E}^{(n)}, \mathcal{D}^{(n)})$ such that

$$\lim_{n \to \infty} D(\mathcal{E}^{(n)}, \mathcal{D}^{(n)}) \le D. \tag{3}$$

Here $D(\cdot, \cdot)$ is whatever average distortion measure is used, e.g.
$\bar{D}$ or $D_e$. The rate-distortion feasible set for a source is the
closure of the set of achievable rate-distortion pairs. The rate-distortion
function $R(D)$ is defined by

$$R(D) = \inf \{ R : (R, D) \text{ is achievable} \}. \tag{4}$$

The rate-distortion frontier is the graph of the rate-distortion function; the
distortion-rate function is the inverse of the rate-distortion function.
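
To make the relation between a rate-distortion function and its inverse concrete, here is a small sketch for a classical analogue, not for the quantum function defined above. It assumes the textbook closed form $R(D) = H(p) - H(D)$ for a Bernoulli($p$) source under Hamming distortion, valid for $0 \le D \le \min(p, 1-p)$ and zero beyond that. The function names are hypothetical.

```python
import math

def h2(x):
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def rate_distortion_bernoulli(p, D):
    """Classical R(D) for a Bernoulli(p) source under Hamming distortion."""
    if D < 0:
        raise ValueError("distortion must be nonnegative")
    if D >= min(p, 1 - p):
        return 0.0
    return h2(p) - h2(D)

def distortion_rate_bernoulli(p, R, tol=1e-12):
    """Inverse of R(D) on its strictly decreasing branch, by bisection."""
    if R >= h2(p):
        return 0.0
    lo, hi = 0.0, min(p, 1 - p)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rate_distortion_bernoulli(p, mid) > R:
            lo = mid   # rate still too high: more distortion is allowed
        else:
            hi = mid
    return 0.5 * (lo + hi)

r = rate_distortion_bernoulli(0.5, 0.11)
d = distortion_rate_bernoulli(0.5, r)
print(r, d)
```

The bisection relies only on $R(D)$ being strictly decreasing on $[0, \min(p, 1-p)]$, which is why the distortion-rate function is well defined as an inverse there.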

If we assume that the coherent information continues to play the role, in
quantum information theory, of the mutual information in classical
information theory, then we are led to define quantum analogues of the
information rate-distortion function.

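The entanglement information rate-distortion function $R^I(D)$ for a source
is defined by:

$$R^I(D) = \min_{\mathcal{A} : d(\mathcal{A}) \le D} I_c(\rho, \mathcal{A}). \tag{5}$$

Here $d(\mathcal{A}) = 1 - F_e(\rho, \mathcal{A})$ is the entanglement
distortion of the single-copy operation $\mathcal{A}$.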
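
The quantity minimized in (5) can be evaluated numerically for a given operation. The sketch below assumes the standard definitions $I_c(\rho, \mathcal{A}) = S(\mathcal{A}(\rho)) - S_e(\rho, \mathcal{A})$, with the entropy exchange $S_e$ computed as the entropy of the matrix $W_{kl} = \mathrm{tr}(A_k \rho A_l^\dagger)$ built from Kraus operators $A_k$. These formulas are not spelled out in the text above, and the function names and example channels are hypothetical.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy in bits, ignoring numerically zero eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def apply_channel(rho, kraus_ops):
    """Output state sum_k A_k rho A_k^dagger."""
    return sum(A @ rho @ A.conj().T for A in kraus_ops)

def entropy_exchange(rho, kraus_ops):
    """S_e = S(W) with W_kl = tr(A_k rho A_l^dagger)."""
    W = np.array([[np.trace(A @ rho @ B.conj().T) for B in kraus_ops]
                  for A in kraus_ops])
    return entropy_bits(W)

def coherent_information(rho, kraus_ops):
    """I_c(rho, A) = S(A(rho)) - S_e(rho, A)."""
    return entropy_bits(apply_channel(rho, kraus_ops)) - entropy_exchange(rho, kraus_ops)

rho = np.eye(2, dtype=complex) / 2
identity = np.eye(2, dtype=complex)
pauli_z = np.diag([1.0, -1.0]).astype(complex)

ic_identity = coherent_information(rho, [identity])
ic_dephased = coherent_information(
    rho, [np.sqrt(0.5) * identity, np.sqrt(0.5) * pauli_z])
print(ic_identity, ic_dephased)
```

For the maximally mixed qubit, the identity operation gives $I_c = S(\rho) = 1$ bit, while complete dephasing gives $I_c = 0$. Restricting the minimization in (5) to a particular family of operations can only yield an upper bound on $R^I(D)$.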

One may conjecture that, as in the classical case, the information
rate-distortion function just defined is equal to the rate-distortion
function $R(D)$ defined in (4), and thus that $R^I(D)$ tells us the lowest
rate at which we can use channel qubits to send a quantum source with
entanglement distortion no greater than $D$. We might worry that peculiarly
quantum features such as the superadditivity of the coherent information or
the failure of data pipelining require some modifications to the
straightforward quantum analogue of the classical result, as they do in the
case of noisy channel coding. In what follows, I will derive a lower bound on
the required description rate; perhaps this bound is not tight due to the
peculiarly quantum effects just discussed, although the fact that general
encodings are used in deriving the bound makes me doubt that the failure of
data pipelining is relevant. I will not discuss achievability. I expect the
techniques required for noisy channel coding may help in showing
achievability, although rate-distortion may be more difficult as we cannot
rely on bounds that only become tight for fidelities near one; the saving
grace may be that the "noise"-like element is only truncation to a smaller
space, and this is likely to be much easier to deal with than a general
channel operation.

The proof I will give uses two lemmas. First, we need the convexity of 
the information rate-distortion function: 

Lemma 1 $R^I(D)$ is a nonincreasing, convex function of $D$; that is,

$$R^I(\lambda D_1 + (1 - \lambda) D_2) \le \lambda R^I(D_1) + (1 - \lambda) R^I(D_2), \tag{6}$$

where $0 \le \lambda \le 1$.

Proof: Nonincrease: As $D$ increases, the domain of the minimization
in the definition of $R^I(D)$ becomes larger (or at least no smaller);
therefore, $R^I(D)$ does not increase.

Convexity: Let $(R_1, D_1)$ and $(R_2, D_2)$ be points on the information
rate-distortion curve, and let $\mathcal{E}_1$ and $\mathcal{E}_2$ be
operations achieving the minimum in the definition of $R^I(D)$ for $D = D_1$
and $D = D_2$ respectively. Consider the operation
$\mathcal{E}_\lambda = \lambda \mathcal{E}_1 + (1 - \lambda) \mathcal{E}_2$.
Since the entanglement disturbance is linear in the operation, this operation
has disturbance
$D_\lambda = D(\mathcal{E}_\lambda) = \lambda D(\mathcal{E}_1) + (1 - \lambda) D(\mathcal{E}_2)$.
Since $R^I(D_\lambda)$ is the minimum of the coherent information over
operations with disturbance at most $D_\lambda$,
$R^I(D_\lambda) \le I_c(\rho, \mathcal{E}_\lambda)$. And since the coherent
information is convex in the operation, this is no greater than
$\lambda I_c(\rho, \mathcal{E}_1) + (1 - \lambda) I_c(\rho, \mathcal{E}_2) =$

$$\lambda R^I(D_1) + (1 - \lambda) R^I(D_2). \tag{7}$$

Notice that the only property of the disturbance that was used in this
proof was the linearity of the disturbance in the operation; hence it applies
to any quantum rate-distortion function defined using a disturbance measure
with this property, in particular to the information rate-distortion function
using average pure-state fidelity.

The second lemma we need is that the coherent information for a process
on a composite state is greater than or equal to the total of the "marginal
coherent informations" for the reductions of the process and the initial state
to the subsystems.

Lemma 2

$$I_c(\rho^{(n)}, \mathcal{E}^{(n)}) \ge \sum_i I_c(\rho_i, \mathcal{E}_i^{(n)}).$$

Here the definition of the reduced operation $\mathcal{E}_i^{(n)}$ is the same
as that of $\mathcal{T}_i^{(n)}$ in (2), except that $\mathcal{D}^{(n)}$ is
omitted on the RHS. $\rho_i$ is of course the marginal density operator of the
$i$-th system.

Proof: The lemma follows, by induction on the number of systems, from the
two-system case:

$$I_c(\rho^{(2)}, \mathcal{E}^{(2)}) \ge I_c(\rho_1, \mathcal{E}_1^{(2)}) + I_c(\rho_2, \mathcal{E}_2^{(2)}). \tag{8}$$

If we model this in the usual way, by purifying $Q_1$ into $R_1$ and $Q_2$
into $R_2$, adjoining an initially pure environment $E$ and effecting the
operation $\mathcal{E}^{(2)}$ by a unitary interaction $U^{Q_1 Q_2 E}$, this
becomes (with all entropies evaluated on the state after the interaction):

$$S(\rho^{Q_1 Q_2}) - S(\rho^{R_1 Q_1 R_2 Q_2}) \ge S(\rho^{Q_1}) + S(\rho^{Q_2}) - S(\rho^{R_1 Q_1}) - S(\rho^{R_2 Q_2}), \tag{9}$$

which may be rewritten

$$S(\rho^{R_1 Q_1}) + S(\rho^{R_2 Q_2}) - S(\rho^{R_1 Q_1 R_2 Q_2}) \ge S(\rho^{Q_1}) + S(\rho^{Q_2}) - S(\rho^{Q_1 Q_2}). \tag{10}$$

The quantity appearing in this last form is the sum of the marginal entropies 
of two subsystems, minus the joint entropy of the composite system; it is 
a quantity which can be larger in quantum theory than it can in classical 
theory, due to entanglement. In this form, the inequality says that this 
excess of marginal over joint entropies is reduced if we ignore (trace over) 
parts of each of the subsystems. This follows from strong subadditivity, as 
we may show by rewriting it yet again as: 

$$S(\rho^{R_1 Q_1 R_2 Q_2}) + S(\rho^{Q_1}) + S(\rho^{Q_2}) \le S(\rho^{Q_1 Q_2}) + S(\rho^{R_1 Q_1}) + S(\rho^{R_2 Q_2}). \tag{11}$$

In this form, it follows from two applications of strong subadditivity (thanks
to Michael Nielsen for this observation). We start with a case of strong
subadditivity for the three systems $R_1$, $Q_1$, and $R_2 Q_2$:

$$S(\rho^{R_1 Q_1 R_2 Q_2}) + S(\rho^{Q_1}) \le S(\rho^{R_1 Q_1}) + S(\rho^{Q_1 R_2 Q_2}). \tag{12}$$

Adding $S(\rho^{Q_2})$ to both sides gives:

$$S(\rho^{R_1 Q_1 R_2 Q_2}) + S(\rho^{Q_1}) + S(\rho^{Q_2}) \le S(\rho^{R_1 Q_1}) + S(\rho^{Q_1 R_2 Q_2}) + S(\rho^{Q_2}). \tag{13}$$

The last two terms on the right hand side are then upper bounded by another
application of strong subadditivity in the form
$S(\rho^{Q_1 R_2 Q_2}) + S(\rho^{Q_2}) \le S(\rho^{Q_1 Q_2}) + S(\rho^{R_2 Q_2})$,
giving (11). ■
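
Inequality (11) can be spot-checked numerically. The sketch below draws a random mixed state on four qubits, ordered as $R_1, Q_1, R_2, Q_2$, and compares the two sides of (11). The helper names, the choice of qubits, and the random-state construction are assumptions made for illustration; a numerical check on one state is of course no substitute for the proof.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy in bits, ignoring numerically zero eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def reduced(rho, keep, dims):
    """Reduced state on the subsystems in `keep` (ascending indices)."""
    n = len(dims)
    t = rho.reshape(tuple(dims) * 2)
    # Trace out unwanted subsystems from the highest index down, so the
    # remaining row/column axes stay aligned after each partial trace.
    for ax in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=ax, axis2=ax + t.ndim // 2)
    d = int(np.prod([dims[k] for k in keep]))
    return t.reshape(d, d)

rng = np.random.default_rng(0)
dims = (2, 2, 2, 2)                     # order: R1, Q1, R2, Q2
G = rng.normal(size=(16, 16)) + 1j * rng.normal(size=(16, 16))
rho = G @ G.conj().T                    # positive semidefinite
rho = rho / np.trace(rho).real          # unit trace

R1, Q1, R2, Q2 = 0, 1, 2, 3

def S(*keep):
    return entropy_bits(reduced(rho, sorted(keep), dims))

lhs = S(R1, Q1, R2, Q2) + S(Q1) + S(Q2)
rhs = S(Q1, Q2) + S(R1, Q1) + S(R2, Q2)
print(lhs, rhs)
```

Because (11) is a consequence of strong subadditivity alone, it should hold for any state on the four systems, not only for states arising from the purification construction used in the proof.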

Theorem 1 Let $(\mathcal{E}^{(n)}, \mathcal{D}^{(n)})$ be an $(n, 2^{nR})$
rate-distortion code with distortion $D$. Then $R \ge R^I(D)$.

Proof: I give the proof as a chain of inequalities and equivalences, 
followed by notes justifying each inequality when possible. 

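\begin{align}
nR &\ge S(\rho^{(n)\prime}) \tag{14} \\
   &\ge S(\rho^{(n)\prime}) - S_e(\rho^{(n)}, \mathcal{E}^{(n)}) = I_c(\rho^{(n)}, \mathcal{E}^{(n)}) \tag{15} \\
   &\ge I_c(\rho^{(n)}, \mathcal{D}^{(n)} \circ \mathcal{E}^{(n)}) \tag{16} \\
   &\ge \sum_i I_c(\rho_i, \mathcal{T}_i^{(n)}) \tag{17} \\
   &\ge \sum_i R^I(d(\rho_i, \mathcal{T}_i^{(n)})) = n \sum_i \frac{1}{n} R^I(d(\rho, \mathcal{T}_i^{(n)})) \tag{18} \\
   &\ge n R^I\Big(\sum_i \frac{1}{n} d(\rho, \mathcal{T}_i^{(n)})\Big) = n R^I(D). \tag{19}
\end{align}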

(14) holds because $nR$ is the log of the dimension of an $n$-block of
channel Hilbert space, which constitutes an upper bound to the von Neumann
entropy of a density operator on that space (here the encoded state
$\rho^{(n)\prime} = \mathcal{E}^{(n)}(\rho^{(n)})$). (15) follows from the
positivity of entropy exchange, (16) from the data processing inequality, (17)
from Lemma 2, the superadditivity of coherent information compared to marginal
coherent information, (18) follows from the definition of the entanglement
information rate-distortion function, and (19) from Lemma 1, the convexity
of the rate-distortion function. ■
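
The first step of the chain, (14), uses only the fact that the von Neumann entropy of a state on a $d$-dimensional space is at most $\log_2 d$. A minimal numerical sketch of that fact follows; the block length, rate, and random-state construction are hypothetical values chosen for illustration.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy in bits, ignoring numerically zero eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def random_density_matrix(dim, rng):
    """Random full-rank density matrix G G^dagger / tr(G G^dagger)."""
    G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

n, R = 3, 1.0                       # hypothetical block length and rate
dim = int(round(2 ** (n * R)))      # dimension of the n-block channel space
rng = np.random.default_rng(1)

entropies = [entropy_bits(random_density_matrix(dim, rng)) for _ in range(5)]
max_mixed_entropy = entropy_bits(np.eye(dim) / dim)
print(entropies, max_mixed_entropy, n * R)
```

The maximally mixed state saturates the bound, so $nR$ is the best dimension-only bound available for step (14).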